
Here, I want to discuss the main probability distributions (to the best of my humble knowledge). Probability fascinates me because it has applications across many areas of science. The principal distributions needed to understand inference and the use of statistical models are the Bernoulli, Binomial, Negative Binomial, Poisson, Normal, and Gamma. Of course, there are many other important distributions that I will not cover here. For each one, I will try to explain the support and the parameters, as well as the idea behind it.

Bernoulli distribution

The first one, and perhaps the most famous, is the Bernoulli distribution. Let X be a binary random variable with probability mass function (PMF) f. Then X \sim Ber(p) has PMF

f(x) = \mathrm{P}(X = x) = p^{x} (1-p)^{1-x}

where the support is x \in \{0, 1\} and the parameter space is p \in (0, 1). The expected value (mathematical expectation) is \mathrm{E}(X) = p and the variance is \mathrm{Var}(X) = p(1 - p).
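
As a small sketch of the probability function f(x) above, it can be written by hand in R and compared with the built-in `dbinom` using `size = 1`, since a Bernoulli variable is a Binomial with a single trial. The helper name `dbern` and the value `p = 0.3` are my own illustrative choices, not part of any package.

```r
# Hypothetical helper: Bernoulli probability function written from the formula
# f(x) = p^x * (1 - p)^(1 - x), for x in {0, 1} and p in (0, 1)
dbern <- function(x, p) {
  stopifnot(all(x %in% c(0, 1)), p > 0, p < 1)
  p^x * (1 - p)^(1 - x)
}

p <- 0.3                                    # illustrative value (assumption)
x <- c(0, 1)                                # the whole support

manual  <- dbern(x, p)                      # P(X = 0) and P(X = 1) by hand
builtin <- dbinom(x, size = 1, prob = p)    # same values from base R

print(cbind(x, manual, builtin))
print(sum(manual))                          # the two probabilities add up to 1
```

Both columns should agree, which is a quick way to convince yourself that the formula and the built-in function describe the same distribution.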

I will not discuss moments in detail here; briefly, the first moment is the mathematical expectation, and the variance is the second central moment. Wikipedia is a good place to start if you want to study this topic further. I love this concept because nearly everything in statistical modeling is linked to a mean, especially in generalized linear models (GLMs). But that is a topic for later.
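
Although moments are not covered in depth here, the Bernoulli case is short enough to write out as a worked example. Using f(x) from above and summing over the two points of the support:

```latex
\begin{aligned}
\mathrm{E}(X)     &= \sum_{x \in \{0, 1\}} x \, p^{x} (1-p)^{1-x} = 0 \cdot (1-p) + 1 \cdot p = p, \\
\mathrm{E}(X^{2}) &= 0^{2} \cdot (1-p) + 1^{2} \cdot p = p, \\
\mathrm{Var}(X)   &= \mathrm{E}(X^{2}) - \{\mathrm{E}(X)\}^{2} = p - p^{2} = p(1 - p).
\end{aligned}
```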

Below is some code that draws fifteen values of p at random and, for each one, plots the two probabilities of the corresponding Bernoulli distribution. In the chart, the x-axis groups the bars by outcome (X = 1 and X = 0), and the y-axis shows \mathrm{P}(X = 1) and \mathrm{P}(X = 0), respectively. Another property that we need to keep in mind is \mathrm{P}(X = 1) + \mathrm{P}(X = 0) = 1.

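set.seed(123)  # make the random draw reproducible

# Grid of candidate probabilities inside (0, 1)
value <- seq(1e-06, 0.999999, by = 0.001)

# Draw fifteen values of p = P(X = 1) and compute q = P(X = 0)
p <- sample(value, size = 15, replace = TRUE)
q <- 1 - p

# One column per outcome, one row per sampled value of p
data0 <- cbind(`X = 1` = p, `X = 0` = q)

# Grouped bars: each color corresponds to one sampled value of p
barplot(data0, beside = TRUE, main = "Bernoulli distribution",
    xlab = "Realization of the variable",
    ylab = "Probability", col = rainbow(15))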
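
The chart above shows probabilities rather than simulated outcomes. To draw actual realizations of a Bernoulli variable, one option is `rbinom` with `size = 1`. This is only a minimal sketch: the value `p = 0.3` and the two sample sizes are arbitrary choices of mine for illustration.

```r
set.seed(123)                    # reproducible draws

p <- 0.3                         # illustrative success probability (assumption)
n <- 15                          # a small sample of realizations

# A Bernoulli(p) draw is a Binomial draw with a single trial (size = 1)
x <- rbinom(n, size = 1, prob = p)
print(x)                         # a vector of zeros and ones

# With many draws, the sample mean and variance should be close to
# p and p * (1 - p), respectively
big <- rbinom(1e5, size = 1, prob = p)
print(c(sample_mean = mean(big), theory_mean = p,
        sample_var  = var(big),  theory_var  = p * (1 - p)))
```

With only fifteen draws the proportion of ones can be far from p; the larger sample is there to show the agreement with \mathrm{E}(X) = p and \mathrm{Var}(X) = p(1 - p).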
